An aws lambda function for rendering prefigure #920

Open
dqnykamp wants to merge 5 commits into Doenet:main from dqnykamp:prefigure-lambda

Conversation

@dqnykamp
Member

@dqnykamp dqnykamp commented Mar 11, 2026

This PR adds the code that was used to create the prefigure.doenet.org endpoint, which renders a PreFigure source XML file into the resulting SVG and annotations file. It includes the Dockerfile used for the endpoint and instructions on how to deploy it. It also includes a .yml for eventually deploying this via CloudFormation, though that .yml is just an AI summary of the steps we took and has not been validated.

@dqnykamp dqnykamp changed the title from "Ability to render graphs with PreFigure" to "An aws lambda function for rendering prefigure" Mar 11, 2026
@dqnykamp dqnykamp marked this pull request as ready for review March 11, 2026 19:25
@dqnykamp dqnykamp requested a review from Copilot March 11, 2026 19:29

Copilot AI left a comment

Pull request overview

Adds an AWS Lambda (container image) service and supporting artifacts to power the prefigure.doenet.org/build endpoint that renders PreFigure XML into SVG plus optional annotation XML, with a DynamoDB-backed cache.

Changes:

  • Introduces a Python Lambda handler that runs prefig build, returns a JSON contract, and caches results in DynamoDB, with an in-memory first-level (L1) cache in front of it.
  • Adds a Dockerfile to build the Lambda container image with PreFigure + native dependencies.
  • Adds deployment/testing assets: a CloudFormation template draft, endpoint testing checklist, and a browser-based test page.

Reviewed changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| prefigure-lambda/app.py | Lambda handler: request parsing, invoking prefig, response shaping, and hybrid (RAM+DynamoDB) caching. |
| prefigure-lambda/Dockerfile | Container build for Lambda runtime including liblouis/pycairo/prefigure and prefig init. |
| prefigure-lambda/prefigure-stack.yml | Draft CloudFormation template for Lambda + DynamoDB + HTTP API + custom domain mapping. |
| prefigure-lambda/ENDPOINT_TESTING.md | Manual verification checklist and curl/jq snippets for endpoint behavior. |
| prefigure-lambda/test-prefigure.html | Simple browser client to POST XML to /build and render returned SVG/annotations. |
| .gitignore | Normalizes .cspell ignore entry and adds Python __pycache__ ignores. |

Comment thread prefigure-lambda/app.py
Comment on lines +160 to +169
# 5. Run Prefigure
cmd = ["prefig", "build", input_filename]

# We switch CWD to work_dir so 'output/' is created there
result = subprocess.run(
    cmd,
    cwd=work_dir,
    capture_output=True,
    text=True
)

Copilot AI Mar 11, 2026

subprocess.run(...) has no timeout. If prefig build hangs (e.g., pathological input), the invocation will run until Lambda timeout and waste concurrency. Consider passing an explicit timeout and handling subprocess.TimeoutExpired with a clear errorCode (and possibly killing the process group).
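
A minimal sketch of the timeout handling suggested above, not the PR's actual code. The helper name `run_build`, the constant `BUILD_TIMEOUT_SECONDS`, and the `BUILD_TIMEOUT` error code are all hypothetical; only `subprocess.run(..., timeout=...)` and `subprocess.TimeoutExpired` come from the Python standard library.

```python
import subprocess

# Hypothetical budget; pick a value below the Lambda function's own timeout.
BUILD_TIMEOUT_SECONDS = 20


def run_build(cmd, work_dir, timeout=BUILD_TIMEOUT_SECONDS):
    """Run cmd in work_dir; return (result, error) where exactly one is None."""
    try:
        result = subprocess.run(
            cmd,
            cwd=work_dir,
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child and waits for it before re-raising.
        error = {
            "errorCode": "BUILD_TIMEOUT",  # hypothetical code, not from the PR
            "message": f"build exceeded {timeout} seconds",
        }
        return None, error
    return result, None
```

In the handler this might be called as `result, error = run_build(["prefig", "build", input_filename], work_dir)`. Note that `subprocess.run` only kills the direct child on timeout; if the build spawns further processes, killing the whole process group would need extra handling that this sketch leaves out.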

Comment thread prefigure-lambda/app.py
Comment on lines +16 to +36
LOCAL_CACHE = {}
dynamodb = boto3.resource('dynamodb')
table_name = "PrefigureCache"
table = dynamodb.Table(table_name)

DEFAULT_HEADERS = {
    'Content-Type': 'application/json',
    'Access-Control-Allow-Origin': '*'
}

# --- HELPER FUNCTIONS ---
def compute_hash(content):
    unique_string = content + CACHE_VERSION
    return hashlib.sha256(unique_string.encode('utf-8')).hexdigest()

def get_from_cache(xml_hash):
    # 1. Check L1 (RAM)
    if xml_hash in LOCAL_CACHE:
        print(f"L1 MEMORY HIT: {xml_hash}")
        return LOCAL_CACHE[xml_hash]

Copilot AI Mar 11, 2026

LOCAL_CACHE grows without bounds across warm invocations. A high-cardinality workload (or malicious traffic) can cause the container to retain many large SVG/XML strings and eventually OOM. Consider using a bounded cache (LRU with max entries/bytes) or making the in-memory layer optional via configuration.
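
One way the bounded in-memory layer could look, as a sketch only. The class name `BoundedCache` and the default of 128 entries are assumptions, and this version bounds the number of entries rather than bytes, so very large SVG strings would still need a separate size check.

```python
from collections import OrderedDict


class BoundedCache:
    """LRU cache bounded by entry count; evicts the least recently used key."""

    def __init__(self, max_entries=128):
        self.max_entries = max_entries
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        while len(self._items) > self.max_entries:
            self._items.popitem(last=False)  # drop the oldest entry

    def __len__(self):
        return len(self._items)
```

A module-level `LOCAL_CACHE = BoundedCache()` could then stand in for the plain dict, with `get` and `put` replacing the membership test and item assignment.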

Comment thread prefigure-lambda/app.py
# --- INITIALIZATION ---
LOCAL_CACHE = {}
dynamodb = boto3.resource('dynamodb')
table_name = "PrefigureCache"

Copilot AI Mar 11, 2026

The DynamoDB table name is hard-coded (PrefigureCache). This makes it harder to deploy multiple environments/stacks (dev/stage/prod) or to rename the table without code changes. Consider reading the table name from an environment variable (with a default) and wiring it in the CloudFormation template.

Suggested change
- table_name = "PrefigureCache"
+ table_name = os.getenv("CACHE_TABLE_NAME", "PrefigureCache")

Comment thread prefigure-lambda/Dockerfile
Comment on lines +37 to +57
# 3. Install Liblouis (Braille support) from Source
# (Standard yum/dnf does not have liblouis, so we build it)
WORKDIR /tmp/liblouis-build
RUN git clone https://github.com/liblouis/liblouis.git . && \
    ./autogen.sh && \
    ./configure --enable-ucs4 --prefix=/usr && \
    make && \
    make install && \
    cd python && \
    pip install . && \
    cd / && \
    rm -rf /tmp/liblouis-build

# 4. Install Pycairo explicitly
# We do this before prefigure to ensure the C compilation succeeds.
RUN pip install pycairo

# 5. Install Prefigure
# We use the [pycairo] extra to tell prefig we have it.
RUN pip install "git+https://github.com/davidaustinm/prefigure.git#egg=prefig[pycairo]"

Copilot AI Mar 11, 2026

This Docker build pulls source directly from GitHub default branches (git clone liblouis and pip install git+.../prefigure.git) without pinning to a tag/commit. That makes builds non-reproducible and increases supply-chain risk if upstream changes. Consider pinning to specific versions/SHAs (and optionally verifying checksums) so the deployed Lambda image is deterministic.

Comment thread prefigure-lambda/test-prefigure.html
Comment on lines +135 to +150
const svg = data.svg;

if (svg) {
  svgContainer.innerHTML = svg;
} else {
  svgContainer.textContent =
    "Error: No SVG found in response: " + JSON.stringify(data);
}
const cml = data.xml;

if (cml) {
  cmlContainer.innerHTML = cml;
} else {
  cmlContainer.textContent =
    "Error: No CML found in response: " + JSON.stringify(data);
}

Copilot AI Mar 11, 2026

The test client renders API-provided strings with innerHTML (svgContainer.innerHTML = svg and cmlContainer.innerHTML = cml). If the response ever contains unexpected markup (especially SVG with scripts/event handlers), opening this page can execute it. Consider sanitizing before injecting, or using textContent for the XML/annotations display (and a safer SVG parsing approach if you need to render SVG).

Comment thread prefigure-lambda/test-prefigure.html
Comment on lines +33 to +37
<script
  type="text/javascript"
  src="https://cdn.jsdelivr.net/npm/diagcess@1.4.0/dist/diagcess.js"
  defer
></script>

Copilot AI Mar 11, 2026

The page loads diagcess.js from a third-party CDN without Subresource Integrity (SRI). Even for a test harness, consider adding an integrity hash + crossorigin attribute or documenting why this is acceptable, to reduce supply-chain risk.
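
For reference, an SRI value is the hash algorithm name, a hyphen, and the base64-encoded digest of the exact file bytes. The sketch below shows how such a value could be computed from a locally downloaded copy of the script; the function name `sri_hash` is hypothetical, and the real hash for diagcess.js is deliberately not given here because it has to be computed from the file actually served.

```python
import base64
import hashlib


def sri_hash(content: bytes, algorithm: str = "sha384") -> str:
    """Return an SRI string such as 'sha384-<base64 digest>' for content."""
    digest = hashlib.new(algorithm, content).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"
```

The result would go into the script tag's integrity attribute, alongside crossorigin="anonymous". Pinning the package version in the URL, as the page already does, matters here because any change to the served bytes invalidates the hash.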

Comment thread prefigure-lambda/prefigure-stack.yml
    Description: The ARN of the ACM Certificate for the domain (must be in the same region)

  HostedZoneId:
    Type: String

Copilot AI Mar 11, 2026

HostedZoneId is described as optional, but as a CloudFormation parameter with no Default it becomes required at deploy time even though the DnsRecord resource is commented out. Consider either removing this parameter until the DNS resource is enabled, or give it a default (e.g. empty string) and gate the DNS record behind a Condition so stacks can be created without supplying a zone id.

Suggested change
-     Type: String
+     Type: String
+     Default: ""

Comment thread prefigure-lambda/app.py
def lambda_handler(event, context):
    debug = False
    query_params = event.get('queryStringParameters') or {}
    if query_params.get('debug') in ('1', 'true', 'True', 'yes'):

Copilot AI Mar 11, 2026

The debug query param enables returning stdout/stderr, working directory paths, and directory listings to any caller. If this endpoint is public, this is an information disclosure risk; consider disabling debug in production (env flag), requiring an auth token/header, or returning only a request id while logging full diagnostics to CloudWatch.

Suggested change
-     if query_params.get('debug') in ('1', 'true', 'True', 'yes'):
+     # Only allow debug mode if explicitly enabled via environment variable.
+     allow_debug_env = os.environ.get('ALLOW_DEBUG', '').lower()
+     if allow_debug_env in ('1', 'true', 'yes') and query_params.get('debug') in ('1', 'true', 'True', 'yes'):
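
The other option mentioned in the comment, returning only a request id while logging full diagnostics, might look roughly like the sketch below. The function name `error_response`, the 500 status, and the `errorCode` and `requestId` field names are assumptions rather than the PR's actual JSON contract; `aws_request_id` is the attribute the Lambda Python runtime exposes on the context object.

```python
import json
import logging

logger = logging.getLogger(__name__)


def error_response(context, error_code, diagnostics):
    """Log full diagnostics; return only an error code and request id."""
    request_id = getattr(context, "aws_request_id", "unknown")
    # Log output from a Lambda function is sent to CloudWatch Logs,
    # so stdout/stderr and paths stay server-side.
    logger.error("build failed request_id=%s diagnostics=%s",
                 request_id, json.dumps(diagnostics))
    return {
        "statusCode": 500,  # hypothetical status choice
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"errorCode": error_code, "requestId": request_id}),
    }
```

A caller who reports the request id gives maintainers enough to find the matching log entry, without the response itself exposing directory listings or command output.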

@changeset-bot

changeset-bot Bot commented Apr 13, 2026

⚠️ No Changeset found

Latest commit: d660a86

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.
